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ABSTRACT 

Subtraction of astrophysical foreground contamination from "dirty" sky maps produced by sim- 
ulated measurements of the Murchison Widefield Array (MWA) has been performed by fitting a 
3'''^-order polynomial along the spectral dimension of each pixel in the data cubes. The simulations 
are the first to include the unavoidable instrumental effects of the frequency-dependent primary an- 
tenna beams and synthesized array beams. They recover the one-dimensional spherically-binned input 
redshifted 21 cm power spectrum to within ^ 1% over the scales probed most sensitively by the MWA 
(0.01 ^ k < I Mpc~^) and demonstrate that realistic instrumental effects will not mask the EoR 
signal. We find that the weighting function used to produce the dirty sky maps from the gridded 
visibility measurements is important to the success of the technique. Uniform weighting of the vis- 
ibility measurements produces the best results, whereas natural weighting significantly worsens the 
foreground subtraction by coupling structure in the density of the visibility measurements to spectral 
structure in the dirty sky map data cube. The extremely dense Mi^-coverage of the MWA was found 
to be advantageous for this technique and produced very good results on scales corresponding to 
\u\ < 500A in the Mw-plane without any selective editing of the uw-coverage. 

Subject headings: Cosmology: Early Universe, Galaxies: Intergalactic Medium, Radio Lines: General, 
Techniques: Interferometric, Methods: Data Analysis 



1. INTRODUCTION 

Astrophysical foreground contaminants are five orders 
of magnitude brighter than the ~ 10 mK redshifted 
21 cm emission expected from neutral hydrogen in the in- 
tergalactic medium (IGM) during the epoch of reioniza- 
tion (EoR). These foregrounds will severely complicate 
planned experiments and the interpretation of their re- 
sults. Galactic synchrotron radiation dominates the sky 
at radio frequencies near 150 MHz (z « 8), accounting 
for > 70% of the 200 to 10,000 K total brightness tem- 
perature (Shaver et al. 1999). Extragalactic continuum 
point sources are also especially strong and numerous, 
presenting a sea of confused point sources and compris- 
ing the bulk of the remaining ^ 30% of the sky brightness 
temperature. Galactic radio recombination lines (RRLs) 
and free-free emission from electrons in both the Galaxy 
and the IGM will additionally complicate the planned 
measurements . 

Initial analyses indicated that these foregrounds were 
an insurmountable obstacle (Di Matteo et al. 2002; Oh 
& Mack 2003) because their angular variance domi- 
nates the expected fiuctuations in the redshifted 21 cm 
background, but subsequent studies have suggested that 
multi-frequency observations and the application of ap- 
propriate statistical techniques should provide methods 
to separate the foregrounds from the redshifted 21 cm sig- 
nal by exploiting the large coherence of the foregrounds 
with frequency (Di Matteo et al. 2004; Zaldarriaga et al. 
2004; Morales & Hewitt 2004; Furlanetto & Briggs 2004; 
Gnedin & Shaver 2004; Santos et al. 2005; Wang et al. 
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2006; McQuinn et al. 2006; Gleser et al. 2008). How- 
ever, these studies have relied on a number simplify- 
ing assumptions, and in particular have neglected com- 
plications due to the frequency-dependent instrumen- 
tal response. The point spread function and instru- 
mental field-of-view are both frequency dependent, and 
can mix the angular structure that concerned Di Mat- 
teo et al. (2002) and Oh & Mack (2003) into the fre- 
quency direction, masking the EoR signal. This transfer 
of power from the angular to frequency dimensions has 
been dubbed "mode-mixing" , and is a significant concern 
for redshifted 21 cm measurements. 

In this paper we explicitly model the mode-mixing ef- 
fect for confusion level contaminants and explore ways 
of minimizing this power transfer into the frequency do- 
main. With these new techniques we show that, for the 
Murchison Widefield Array (MWA), contamination due 
to mode-mixing can be reduced well below that of the 
expected 21 cm signal strength during the rcionization 
epoch for a reasonable foreground model. We begin in 
§2 by defining our instrument and sky models and estab- 
lishing the mathematical foundation of the subtraction 
technique. In §3 we analyze the ability of the method to 
produce a model of the confusion-level foreground con- 
tamination and, in §4, we show that the input 21 cm 
power spectrum in our simulations can be recovered to 
within ~ 1% (excluding thermal noise uncertainty) over 
most scales in the planned MWA measurements. We 
conclude in §5 with a brief discussion and an analytic 
approximation to provide context to the results of the 
simulation. 

A detailed study of the dependence of this foreground 
subtraction technique on the properties of the instru- 
ment and astrophysical sky has been conducted in par- 
allel by Liu et al. (2008) for cases building on our fidu- 
cial MWA measurement framework. In addition, Jclic 



et al. (2008) have recently developed a sophisticated as- 
trophysical foreground model and applied it to analysis 
of subtraction techniques in the context of the expected 
antenna configuration of the Low Frequency Array (LO- 
FAR). Their results indicate that LOFAR should be able 
to constrain the reionization history over a wide range of 
redshifts by measuring the variance in image maps from 
individual spectral channels between 100 and 200 MHz. 
Our present focus is on foreground subtraction in the 
context of three-dimensional power spectrum measure- 
ments including the full frequency-dependent instrumen- 
tal properties of the MWA, but many of the results of all 
three investigations are relevant to both measurements 
and instruments, as well as other active redshiftcd 21 cm 
experiments including the Precision Array to Probe the 
Epoch of Reionization (PAPER), the reionization project 
with the Giant Metre- wave Radio Telescope (GMRT), 
and the future Square Kilometer Array (SKA). 

2. STATEMENT OF THE PROBLEM 

Galactic synchrotron radiation, extragalactic contin- 
uum sources, and free-free emission dominate brightness 
temperature maps of the low-frequency radio sky and 
yield diffuse angular structure that is several orders of 
magnitude more intense than the expected redshifted 
21 cm background from the reionization epoch and Dark 
Ages. The fundamental problem faced by all redshiftcd 
21 cm experiments, therefore, is how to effectively isolate 
the desired signal at high precision from these nuisance 
foregrounds. The strategies discussed in the literature 
for removing the foreground contamination all exploit a 
single, common spectral property that distinguishes the 
foregrounds from the expected signal. Along any one line 
of sight, each of the foreground components has a slowly 
varying power-law-like spectrum. This results in a large 
spectral coherence scale and is in contrast to the red- 
shiftcd 21 cm signal from the reionization epoch, which 
fluctuates relatively rapidly in all three spatial dimen- 
sions, and thus has a short coherence scale, both in fre- 
quency and angle (Morales & Hewitt 2004). In general, 
the spatial coherence length of the reionization signal is 
of order 10 Mpc, which translates to sub-degree scale 
fluctuations on the plane of the sky and sub-MHz fluc- 
tuations in frequency. Thus, although the angular fluc- 
tuations in the foreground intensity span the scales of 
interest for the 21 cm signal, the frequency fluctuations 
are on much larger scales than the desired signal. 

In the image domain, the "per pixel" technique of sub- 
tracting a smooth spectral component from each pixel 
in a data cube should be able to effectively remove the 
foreground contribution (Furlanetto & Briggs 2004; San- 
tos et al. 2005; Wang et al. 2006; McQuinn et al. 2006) 
and only sacrifice a few of the largest modes of the red- 
shifted 21 cm fluctuations. A closely related approach is 
to produce internal linear combinations (ILCs) of maps 
from different spectral channels (e.g. difference maps) 
in analogy with the removal of Galactic emission from 
CMB measurements (Gnedin & Shaver 2004; Di Mat- 
teo et al. 2004). Unlike in CMB studies, however, tem- 
plate fitting of specific foreground components is not a 
viable approach for redshifted 21 cm foreground subtrac- 
tion due to the extremely high level of accuracy needed 
and the scarcity of suitable templates at the target fre- 
quencies. In the Fourier domain, one can also fit low 



order polynomials or derive ILCs to aid in removing fore- 
ground emission, and this has certain advantages for in- 
struments with sparse u, v coverage (Zaldarriaga et al. 
2004; Ue-li Pen personal communication). In all of these 
approaches and the additional statistical techniques dis- 
cussed in Morales et al. (2006), the angular 21 cm fluctu- 
ations are effectively ignored in favor of the line-of-sight 
fluctuations. 

There is a significant obstacle to implementing the 
techniques listed above in real- world applications: invert- 
ing the instrumental response matrix for the MWA and 
other low-frequency arrays will not be possible. Thus, 
the effects of their frequency-dependent point spread 
functions (PSFs) and fields-of-view (FOVs) will remain 
in the final data products and mix the strong angular 
foreground fluctuations into the frequency domain. This 
instrumental effect can produce sufficient spectral struc- 
ture in the derived brightness temperature maps to de- 
stroy the large spectral-coherence of the foregrounds and 
permanently mask the 21 cm signal. Even a modest ob- 
servational bandwidth of 30 MHz at 150 MHz {z « 8) 
represents a fractional bandwidth of 20%. Thus, the ar- 
ray appears 20% larger in the uu-plane at the highest fre- 
quencies than the lowest, causing the PSF in uncorrected 
maps to change in angular size by 20% from the one end 
of the frequency band to the other. The FOV is similarly 
20% larger at the bottom of the band than the top.^ Con- 
sequently, ripples and peaks introduced into the image 
by the instrumental PSF scale with frequency and ap- 
pear in different locations at different frequencies. This 
is the mechanism that couples angular fluctuations into 
the frequency domain, and is shown graphically in Figure 
1. Similarly, the varying FOV signiflcantly changes the 
input sky signal received at different frequencies. This 
is particularly important for Fourier domain subtraction 
techniques, where a 20% change in FOV implies that 
~^40% of the signal is not in common for a visibility mea- 
sured at the same point in the uw-plane, but at opposite 
ends of the band. 

In order to study the effect of mode-mixing, we need to 
carefully simulate the instrument, foreground contamina- 
tion, and analysis pipeline. In the following sections we 
describe the simulation parameters used for this study. 

2.1. Epoch of Reionization Observations with the 
Murchison Widefield Array (MWA) 

The fiducial MWA design is described in Bowman et al. 
(2006, Their Section 2). The array design consists of 
N — 500 antennas distributed within & D = 1500 m 
diameter circle. The density of antennas as a function of 



radius is taken to go as ' 



■ , but capped at a maximum 



density of one antenna per 36 vc? . The antenna response 
is approximated by 
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where Op is proportional to wavelength and is 31° at 
i^ = 158 MHz. The angular resolution of the array is 
given by X/D and the total collecting area by TV dA^ 
where dA is the collecting area of each antenna and scales 

^ One can design antennas with a frequency-dependent collecting 
area to counteract the effect of the aperture becoming smaller in 
wavelengths and produce a near constant FOV. However, for a 
number of technical reasons, none of the upcoming arrays use this 
kind of detector element. 



165 
160 
155 
150 
145 
140 
135. 



I 



10 



15 



20 25 

-resolution elements 



30 



35 



40 



J 

1 
1 

] 

I 

-J 

45 



TABLE 1 

MWA Design Specifications 



Parameter 


Value 


Array layout, p(r) (m^^) 


^r-2 


Array diameter, D (m) 


1500 


Array latitude (°) 


-27 


Bandwidth, B (MHz) 


32 


Spectral resolution (kHz) 


40 


Number of antennas, N 


500 


Angular resolution, Og (°) 


0.073 


Antenna collecting area, dA (m^) 


14.4 


Antenna response scale, ©p (°) 


31 


Central frequency, uo (MHz) 


158 


System temperature, Tgys (K) 


440 



Fig. 1. — Cartoon of a frequency vs. position plot to illustrate the origin of mode-mixing from a frequency dependent point spread function 
(PSF). At the origin of the horizontal axis a source has been mis-subtracted, and the residual flux ripples across the image due to the array's 
point spread function. Since the PSF scales with frequency, the positions of the radial intensity peaks change with frequency resulting 
in diagonal frequency bands. At linos of sight away from the mis-subtracted source, this leads to frequency-dependent contamination, as 
indicated by the contribution along the vertical dashed lines. The frequency-dependent PSF is inherent in the measurement, and mixes 
a pure spatial component (mis-subtracted sources) into the frequency direction. In addition, as the frequency slope of the contamination 
depends on the distance to the residual sources, a wide range of linc-of-sight fe-modes arc contaminated. 

the MWA to a high degree of precision (better than 1 
part in 10^) in order to reveal the fluctuations in the 
redshifted 21 cm background. For typical interferome- 
ters, the sparse coverage of visibility measurements in 
the uw-plane means that a substantial amount of power 
from these point sources is spread across the derived map 
of the sky due to the side lobes of the point spread func- 
tion of the synthesized beam. This poses a considerable 
deconvolution problem and makes it difficult to subtract 
them from the measurements since the locations and in- 
tensities of the point sources must be determined very 
accurately before their contributions to the individual 
visibility measurements can be removed to high preci- 
sion. The large number of antennas used for the MWA 
reduces the side lobes of the synthesized beam and pro- 
vides high angular dynamic range in uncorrected maps of 
the sky. This is advantageous for isolating and subtract- 
ing the power from extremely intense extragalactic con- 
tinuum sources and alleviates the severity of the bright 
source contamination. 

2 - Diffuse spectral subtraction. The next step is 
to remove diffuse emission from the Galaxy and faint 
confusion-level sources by fitting a polynomial to the 
spectrum along each line of sight in the observed sky 
maps in order to subtract the spectrally smooth fore- 
ground component. To aid this process, the angular size 
of the primary antenna beam (field of view) for the MWA 
has been matched to fit within the relatively cold re- 
gions that exist in the Galactic synchrotron emission at 
high Galactic latitudes. Figure 2 illustrates the typical 
brightness temperature of the synchrotron foreground at 
150 MHz, as well as the primary field of view for an MWA 
antenna tile that is targeting a cold part of the sky. 

3 - Statistical template subtraction. The final step in 
the foreground subtraction process is envisioned to take 
place as part of the 21 cm power spectrum analysis by 
fitting templates of the residual statistical structures of 
foregrounds that may remain even after the first-order 
subtraction of their emission from sky maps. 

In the analysis that follows, we focus only on the sec- 
ond step: subtracting faint sources and diffuse emission 
that cannot be identified as individual contributions in 
the maps and deconvolved efficiently. We do not model 
the first step of bright source deconvolution, but rather 
assume it has been performed perfectly. At the end of 



Note. — Parameters are listed with their 
fiducial values at t' = 158 MHz, correspond- 
ing to 2 = 8 for redshifted 21 cm measure- 
ments. 

like dA — 16(A^/4) for A < 2.1 m and is capped for longer 
wavelengths. Finally, the full instantaneous bandwidth 
of the instrument is i? = 32 MHz and the spectral reso- 
lution is 10 kHz, although the FOR observations will be 
binned to > 40 kHz resolution to reduce the data volume. 
All of the fiducial properties are summarized in Table 1. 
For the analysis in this paper, we define the observation 
to be of a single field with 360 hours integration during 
the most favorable circumstances. Additionally, we set 
the frequency coverage to 142 < v < 174 MHz, which 
spans 9 > z > 7.1. 

The foreground subtraction strategy planned for red- 
shifted 21 cm measurements with the MWA is a multi- 
stage process consisting of three primary components: 

1 - Bright source subtraction. The first step is to sub- 
tract individual bright sources using a full deconvolu- 
tion approach that incorporates the position, frequency, 
and time dependent antenna calibrations to remove the 
sources to high precision. The MWA has been designed 
to aid in the mitigation of extremely intense extragalactic 
and Galactic continuum sources. It is common practice 
in radio astronomy to "peel" away bright sources in the 
field of view to improve the overall dynamic range in de- 
rived maps of the sky brightness and, thus, to increase 
the sensitivity to fainter signals. This process must be 
applied to the bright point sources in measurements by 



this paper, we look briefly at the implications from the 
present analysis for the final planned foreground subtrac- 
tion step of statistical template fitting. 

2.2. The Sky Model 

In this section, we specify the details of the sky model 
for the astrophysical foregrounds and the redshifted 21 
cm signal. We build our foreground sky model adhering 
closely to existing empirical findings. The underlying 
physical mechanisms behind the observed properties are 
not addressed in the development of this model since we 
are interested primarily in the functional consequences 
of their manifestations on the sky and in the instrument 
(for an alternative treatment see Jelic et al. (2008)). We 
begin by constructing an empirically-motivated model of 
the astrophysical foregrounds that consists of two compo- 
nents: 1) discrete, faint extragalactic continuum sources 
and 2) diffuse Galactic synchrotron emission. Together, 
these are expected to account for approximately 98% of 
the total intensity in the radio spectrum below 200 MHz 
(Shaver et al. 1999; Bridle 1967). We exclude free-free 
emissions as a separate component in our analysis since 
they have power-law spectra similar to the other compo- 
nents (Kogut et al. 1996) and they are easily subsumed 
by the uncertainty in the discrete continuum source con- 
tribution, although the free-free emissions have been 
shown to correlate with dust clouds at high Galactic lat- 
itudes, and thus have different angular structure than 
the discrete source population. We have also chosen to 
ignore supernova relics, radio clusters, and RRLs in this 
analysis. The first two categories of sources do not differ 
substantially from the power-law-like components we in- 
clude in the model, and the frequencies at which Galactic 
RRLs occur are known and can be excised from obser- 
vations if they are determined to be a significant contri- 
bution. Diffuse RRLs have never been observed at high 
Galactic latitudes, but they are expected to be narrower 
than the 40 kHz frequency channel size of the MWA and 
occur approximately every few MHz, and therefore in 
only a few percent of the spectral channels [Erickson et al. 
(1995), Shaver (1975) and Walmsley & Watson (1982), 
are good starting points for additional details about the 
observed properties and theoretical treatment of RRLs, 
respectively]. Thus, the cost of excising the RRLs is a 
minor complication to the window function, but one that 
is expected to be no more severe than that caused by ex- 
cising radio- frequency interference (RFI). 

2.2.1. Diffuse Galactic Synchrotron Radiation 

For the diffuse Galactic synchrotron foreground, there 
are a number of observations on which to base the model 
(Bridle 1967; Landecker & Wielebinski 1970; Haslam 
et al. 1982; Reich & Reich 1988; Alvarez et al. 1997; 
Roger et al. 1999). The spectrum of the Galactic syn- 
chrotron emission has been found to be a nearly fea- 
tureless power-law with modest variations in the spec- 
tral index as a function of direction and frequency. At 
the frequencies of interest, the spectral index is given by 
T ^ v^^ ^ and has a typical value of /3 « 2.5 (Rogers & 
Bowman 2008). In general, it is steeper at high Galactic 
latitudes than toward the Galactic plane. The spectral 
index also steepens as a function of frequency to a maxi- 
mum of /3 w 2.8 by i^ « 1 GHz, but is generally constant 
below about 200 MHz. The variation of the spectral 



index across the sky has a standard deviation of order 
(7/3 K, 0.1 on degree scales (Bridle 1967; Reich & Reich 
1988; Platania et al. 1998; Roger et al. 1999) and ap- 
pears to be weakly correlated to the angular structure 
(Reich & Reich 1986; Lawson et al. 1987; Platania et al. 
2003). Angular structure in the diffuse Galactic emis- 
sion has been shown to be well-described by a power-law 
spectrum in Fourier space over a large range of scales. 
The angular power-law index is specified according to 
Cn ~ £"", where typically 2.4 < a < 2.9 (Platania et al. 
1998; Giardino et al. 2001, 2002; La Porta et al. 2008), 
and the spherical harmonic multipole is related to the 
uw-plane baseline length by ^ ~ 2itu. 

We construct our model of the diffuse Galactic syn- 
chrotron radiation in two steps. First, we produce a 
Gaussian random field with angular power-law index of 
a — 2.55. The amplitude of this field is normalized to 
both the mean and angular fluctuation power in the re- 
gion of the all-sky map of Haslam et al. (1982) centered 
on R.A. = 60° and Dec. — —30°. This region has been 
chosen as the primary target field for the MWA due to 
its relatively low brightness temperature and its location 
roughly opposite the Galactic center. In order to produce 
appropriate amplitude normalizations, the temperatures 
in the Haslam et al. (1982) map are scaled from 408 MHz 
to 150 MHz using a constant spectral index oi (3 — 2.6. 
The results of this scaling are shown in Figure 2. Since 
the Haslam et al. (1982) map is lacking angular infor- 
mation on scales smaller than about 1°, only multipoles 
below £ < 200 were used to normalize the angular fluctu- 
ation power in our model. Figure 3 illustrates the angular 
power spectrum of our model at 150 MHz, as well as the 
power spectrum derived from the Haslam et al. (1982) 
map. To complete our synchrotron model, the second 
step is to assign a spectrum for each line-of-sight in the 
simulated map. We specified each spectrum as a power 
law with spectral index drawn from a second Gaussian 
random field. The spectral index field has /3 = 2.6 and 
angular structure given by the sum of two terms: 1) a 
white noise contribution with ct/j = 0.1 and 2) the sim- 
ulated brightness temperature map at 150 MHz rcnor- 
malized such that it yields an additional variance in the 
spectral index map of CT/j = 0.05. Thus, the spectral 
index map is weakly correlated with the brightness tem- 
perature map. The level of correlation was set by eye 
from inspection of derived brightness temperature and 
spectral index maps including, in particular, the maps 
of Platania et al. (2003), although we note that those 
results were found at higher frequencies. 

2.2.2. Extragalactic Continuum Sources 

Normal galaxies, radio galaxies, and active galactic 
nuclei, form the majority of the extragalactic contin- 
uum sources (Santos et al. 2005). Their contribution 
to the brightness temperature has been found to be of 
order 30 to 70 K at 178 MHz by Bridle (1967) by fitting 
an isotropic background component to measurements of 
spatial variations in the spectral index of the diffuse emis- 
sion between 18 and 404 MHz. In the same analysis, 
it was found that the integrated brightness temperature 
due to extragalactic sources has a characteristic spec- 
tral index oi P — 2.7, although with ^ 10% uncertainty, 
and therefore is somewhat steeper than the Galactic syn- 
chrotron spectral index. Individual sources have been 
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Fig. 2. — Brightness temperature of the Galactic synchrotron foreground at i/ = 150 MHz. The all-sky measurements of Haslam et al. 
(1982) were scaled from 408 MHz using a constant spectral index of /3 = 2.6 to produce this map. The declination of the zenith at the MWA 
latitude is shown (horizontal dotted line), as is the declination range within zenith angle of 45° (horizontal solid lines). The MWA will 
have access to cold regions (^ 250 K) of the sky both north and south of the Galactic plane. The primary beam of the MWA antenna tiles 
is shown centered at the two planned target fields, marked by plus symbols at 60 R.A. by -30 declination and 115 R.A. by 10 declination. 
The concentric circles indicate the 50% (solid), and 10% (dashed) power response relative to the peak response. 

observed to have significantly more variation in their 
spectral indices, however, spanning —2 ^ /3 ^ 3. The 
inverted cases {(3 « —2) are due to sources with strong 
synchrotron self absorption, such that the spectrum has 
already flattened at the frequencies of interest (as op- 
posed to flattening below ^ 10 MHz, as would be more 
typical). In general, the spectral index of discrete extra- 
galactic sources is related to the intensity of the source. 
It is steeper for brighter sources (with maximum [3 ~ 3) 
than for fainter sources, where /3 ~ 2.5 is typical for 
sources with fluxes less than S < 0.1 Jy (Kellermann 
et al. 1969; Zhang et al. 2003; Cohen et al. 2004). 

A number of surveys of radio sources have been per- 
formed at frequencies relevant to the MWA (see Cohen 
et al. (2004, their Fig. 1)). They include a survey at 
74 MHz by Cohen et al. (2004), the 7C (McGilchrist et al. 
1990) and 6C surveys (Hales et al. 1988) at 151 MHz, 
the SCR survey (Laing et al. 1983) of bright (> 10 Jy) 
sources at 178 MHz, WENSS (Rengelink et al. 1997) 
at 327 MHz, and the 5C5 survey (Pearson 1975) at 
408 MHz, as well as the FIRST survey (White et al. 1997) 
and NVSS (Condon et al. 1998) at 1.4 GHz. Analysis 
of the catalogs of radio sources derived from these and 
other surveys have yielded a solid understanding of the 
statistical properties of radio sources (Perley & Erickson 
1984; Windhorst et al. 1985; Wieringa 1991; Fomalont 
et al. 1991; Windhorst et al. 1993). The brightest sources 
in the catalogs are relatively nearby objects and, thus, 
have differential source counts given by N{S) ~ 5"^'^, as 
would be expected for a population of objects distributed 
evenly in a Euclidean space. Weaker sources are gener- 
ally more distant and the observed differential counts fall 
off more gradually, as N{S) ~ 5*^^-^, due to evolution of 
the population and redshift effects. At extremely faint 
flux levels (in the fiJy regime), a new population of lo- 
cal starburst galaxies becomes visible and the differential 
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Fig. 3. — Angular power spectra of the two model foreground 
components along with the power spectrum derived from the 
Haslam et al. (1982) map (scaled to 150 MHz) over the region 
spanned by the MWA primary target field. The difi'use Galac- 
tic synchrotron emission dominates the foreground model at large 
scales (low £), where its angular fiuctuation amplitude and power- 
law index have been matched to the Haslam et al. (1982) map. For 
small, sub-degree scales (£ > 200), the power in the Haslam et al. 
(1982) map drops off rapidly due to its limited angular resolution, 
whereas the Galactic component of the foreground model continues 
its pure a = 2.55 power-law profile. At very small scales {£ > 1000), 
the angular power of the extragalactic continuum source model be- 
comes comparable to the diffuse Galactic power since the contin- 
uum source fluctuations have a Poisson-noise floor. 
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counts again become steeper, with N{S) ^ S^'^'^. Differ- 
ential source counts cannot asymptote to this power-law 
as S' ^- 0, however, since it is steeper than N{S) ~ S~^ 
and an infinite radio flux would result. The differential 
counts must eventually flatten again with decreasing flux. 
For our extragalactic continuum source model, we first 
produce a simulated catalog of radio sources and then 
draw from that catalog to sum over individual sources 
within each pixel of a simulated sky map. We use the 
expression for differential source counts given by Sub- 
rahmanyan & Ekers (2002), 



Nis) = 100 s;,Y ^, 



-0.8 
GHz 



arcmm 



MJy" 



(2) 



where 5'^jy is the flux measured in /xJy and j^ghz is the 
frequency measured in GHz. Based on this equation, we 
estimate that it should be possible to identify individual 
sources in sky maps produced by the MWA down to a 
flux limit of approximately 10 mJy before the sources be- 
come confused (in other words, there will be more than 
approximately one source per pixel below 10 mJy and, 
hence, individual sources will not be identifiable). Since 
the integrated flux due to faint radio sources in Equa- 
tion 2 converges around 10 /iJy, and since we want to 
limit the number of sources in our simulated catalog, we 
only model sources between 10 /iJy < S* < 10 mJy. For 
each source in our simulated catalog, we randomly assign 
a flux such that Equation 2 is satisfied. We also randomly 
assign a spectral index for each source from a distribution 
of spectral indices that is the suni of two Gaussian dis- 
tributions. One distribution has /3 = 2.7 and ap — 0.1, 
and the second distribution has (3 = and a/s = 0.6 in 
order to account for the large variation of spectral in- 
dices in extragalactic continuum sources. Only 10% of 
the sources have their spectral indices drawn from the 
second distribution. 

In order to produce a realistic sky map from our sim- 
ulated source catalog, the angular correlation of radio 
sources must be taken into account. Blake & Wall (2002) 
analyzed the 1.4 GHz NRAO VLA Sky Survey (NVSS) 
and found evidence for weak clustering of the continuum 
point sources. The angular correlation function that they 
derived is 

w{9) = 10-3 0-0-8^ (3) 

for measured in degrees. The magnitude of this clus- 
tering is much lower than for the galaxy correlation func- 
tions determined optically, and is the result of projection 
effects. Deep radio catalogs contain sources spanning a 
large range of redshifts since their continuum spectra do 
not allow for grouping by photometric redshift measure- 
ments. This causes the angular correlation function of ra- 
dio continuum sources to dilute since unrelated volumes 
of the Universe are superimposed in the observations. 
We follow the method of Gonzalez-Nuevo et al. (2005) 
to make a realization of the source counts in pixels that 
includes both the Poisson variance and the proper angu- 
lar clustering. The large scale structure introduced by 
the angular clustering adds an additional variance to the 
distribution of point sources and the resulting brightness 
temperature in the sky map. Using this source count 
map, we populate each line-of-sight in the extragalactic 
continuum source foreground model by drawing the ap- 
propriate number of sources from our simulated catalog 
and summing their contributions. Figure 3 shows the 



angular power spectrum resulting from this process at 
150 MHz. 

2.2.3. Redshifted 21 cm Signal 

Since the purpose of this work is to study the effects of 
a realistic interferometer on foreground subtraction, we 
employ a simple model for the redshifted 21 cm signal. 
Averaging over velocity field distortions, the brightness 
temperature of diffuse redshifted 21 cm emission can be 
described by. 
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(4) 
where 6 is the matter density perturbation field, xhi 
is the neutral fraction of hydrogen, Tcmb = 2.725 K 
is the CMB temperature, and Ts is the spin tempera- 
ture describing the relative population of the 21 cm hy- 
perfine states. For our simple model, we assume that 
the IGM remains fully neutral at our target redshift of 
z = 8 and that the spin temperature of neutral hydro- 
gen is Ts S> Tcmb- The only fluctuations in the 21 cm 
brightness temperature, therefore, are due to matter den- 
sity perturbations, which we model as a Gaussian ran- 
dom field with a power spectrum given by the output 
of CMBFAST (Seljak & Zaldarriaga 1996) and scaled to 
the target redshift according to the linear growth factor. 

2.3. Instrumental Simulation 
We start with our model of the frequency-dependent 

sky brightness I{{6,i/}), as detailed in §2.2. For the re- 
mainder of this section we will drop the explicit frequency 
dependence of all the terms to make the notation clearer. 
The result of observing the true sky with the MWA is a 
vector, m, of complex visibility measurements. This pro- 
cess can be expressed as 



m(w) = M(^,^) I{e)+n{v), 



(5) 



where v indexes the individual measurements, M is the 
instrumental response matrix that maps the true sky to 
the resulting measurements, and n is a vector containing 
the system noise added to each measurement. 

In observations with the MWA the visibility data must 
be calibrated and compressed so that long integrations 
can be archived for later analysis, and this process is 
detailed in Morales & Matejek (2008); MitcheU et al. 
(2008). Assuming perfect calibration, the resulting im- 
ages and Fourier representations can be simulated as 



I'{9) = B{9j) I{9)+n{ff) 
/'(u) = B(u,u)/(u) + n(u). 



(6) 

(7) 



Equations 6 and 7 are Fourier transforms of one another 
and the operator B represents the frequency dependent 
array beam and antenna field of view and / is the true sky 
in either angular or transformed coordinates. /' in the 
first line is the standard interferometric dirty map, and 
the gridded ww-plane in the second line. Depending on 
the application we will alternate between these two forms 
of the equation. The first line is useful when fitting spec- 
tra to individual image pixels, but has the disadvantage 
of a highly covariant noise matrix N = n'^ {9)n{9) . The 
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second line has (to good approximation) a diagonal noise 
matrix that is simply related to the density of visibility 
measurements in the uu-plane. 

Relating these expressions to the polynomial-based 
foreground subtraction investigations of Furlanetto & 
Briggs (2004), Wang et al. (2006), and Gleser et al. 
(2008), we find that these works treated simplified cases 
of this general approach by assuming complete coverage 
of the visibility measurements in the uu-plane, equivalent 
to using B = I, where I is the identity matrix, and by 
setting n(u) = or assuming n(u) to be constant so that 
the noise covariance matrix, N, is diagonal in the image 
domain, even though this is not attainable for a realistic 
distribution of antenna elements. Jelic et al. (2008) have 
improved on these initial approximations by including 
the planned coverage of visibility measurements in the 
uw-plane for LOFAR, although the frequency dependence 
of their ww-coverage is specifically removed by excising 
baselines lacking corresponding measurements at all fre- 
quencies and they neglect the frequency-dependence of 
the primary antenna beam. Here, we extend the grow- 
ing foundation of these efforts by investigating the full 
frequency-dependent effects of including a realistic an- 
tenna distribution and primary antenna FOV for the 
MWA to calculate B and n. We expand B into two 
primary constituents 

B = Barray Bant) (8) 

where Barray accouuts for the synthesized array beam 
due to the coverage of the visibility measurements in the 
uu-plane and Bant accounts for the response of the pri- 
mary beam of the antenna tiles. For Bant we use the 
window function in Equation 1, and for Barray we com- 
pute the typical uu-coverage of an MWA observation in- 
cluding earth-rotation synthesis for a 90° track through 
zenith as shown in Figure 4. 

For the noise contribution to the simulated observa- 
tions, we take n{v) to be a vector of Gaussian random 
variables with standard deviation (Morales 2005) 

a„ = ^^^, (9) 

clA Vdiy T 

where ks is the Boltzmann constant, Tsys — 440 K 
is a conservative estimate of the system temperature 
(including sky noise) of the MWA at v = 158 MHz, 
dA = 14.4 m^ is the effective collecting area of an MWA 
antenna tile at the same frequency, dv = 40 kHz is the 
spectral resolution of a single channel, and r = 8 s is the 
accumulation duration for each visibility measurement. 
We approximate n(u) by pixelizing the uv plane with 
cells equal in area to the inverse of the antenna FOV. 
The contribution of any single visibility measurement is 
then applied to only one grid cell. In this limit, the ther- 
mal noise is uncorrelated in the Fourier domain maps 
and given by, n(u) — ct„/-\/7V(u), where N{u.) is the 
number of visibility measurements contributing to each 
grid cell. Using the radiometer equation with the mean 
filling fraction for the MWA antenna configuration gives 
a fiducial thermal uncertainty for a derived image map of 
^ 335 mK after 360 hours, although in practice the ther- 
mal noise is dependent on angular scale size and improves 
rapidly as scale size increases due to the highly condensed 
antenna distribution such that at degree-scales the ther- 
mal uncertainty is only ^ 5 mK. 



The final component of modelling the foreground sub- 
traction is to actually fit and subtract the foreground 
model from the sky brightness map derived from the 
observations. There are several reasonable methods 
that can be used to derive the sky brightness from the 
archived measurements. The most ideal map to use 
would be a minimally biased estimate of the true sky. 
This would require inverting the instrumental response 
matrix, B, and the noise covariance matrix (Tegmark 
et al. 1997). In practice, it will not be feasible to in- 
vert B and our goal is to set a worst-case limit due to 
the effects of the instrumental response of the MWA on 
subtracting foreground contamination, and a minimally 
biased estimate of the sky should remove many of these 
effects. Thus, we turn instead to what is commonly called 
visibility weighting of the dirty sky map in radio astron- 
omy. This is the map that results by simply multiplying 
the uv representation by a position-dependent weight, 
or equivalently Wiener filtering. The weighted dirty sky 
map is given by 



/ae)=F(0,u)U(u,u)/'(u) 



(10) 



where F is the Fourier transform operator and U is a 
weighting function for each pixel in the Fourier map 
(Sauh 1984; Briggs et al. 1999; Thompson et al. 2001). 
The smallest uncertainty in the derived sky map due to 
thermal noise is achieved by weighting by the inverse of 
the variance due to the (Gaussian) thermal noise in each 
pixel in the Fourier domain. This is commonly called 
natural weighting, and in this formalism corresponds to 
U = B if the noise in each visibility is the same. Al- 
though it produces the least uncertainty, this method 
typically emphasizes the information contained in short 
baselines because short baselines are more numerous in 
radio interferometers than long baselines. Thus, the 
effective resolution of the derived sky brightness map 
is lower than the \/ D expectation. Another common 
method, called uniform weighting, alleviates this under- 
rcsolution at the expense of introducing more uncertainty 
into the derived sky map. Uniform weighting, like its 
name suggests, is constant for each pixel in the Fourier 
domain. This allows the information in the long base- 
lines to be emphasized and restores the effective reso- 
lution of the sky map to the expected level. Figure 5 
illustrates the difference in the weighting functions. We 
will test both of these weighting methods, as well as an 
approximation to the natural weighting function derived 
by fitting a double exponential to the distribution of the 
number of baselines per uw-cell. We will see if the fore- 
ground subtraction technique is robust to any effects the 
weighting scheme might introduce. 

Once a dirty sky map has been generated, a low-order 
polynomial is fit to each pixel in the map and subtracted. 
The residual contamination, r(0), after this subtraction 
is given by 



ruie) = luio) - d{e) 



(11) 



where d{9) is the "dirty" foreground model generated 
from the polynomial fits. However, regardless of the 
weighting used to determine the model d{9), we want to 
subtract this model from the unweighted data to preserve 
the signal-to-noise ratio of the original signal. Leaving 
the weighting function in the Fourier domain, the resid- 



ual in the unweighted data is given by 



r'{0) 



?,u)U"^(u,u) F{u,9) ru{0) 



= F-\e,u)ij-^{u,u) F{u,e)[iu{e) - d{e)] 

= I' {6) - F-i(6',u)U-^(u,u) F(u,6i) ^(6*) 
r'{e)= I'iO) d'{e) (12) 

where d'{e) = F-i(6', u)U-i(u, u) F(u,6i) d{e). The 
residual, r', in the Fourier domain map is the quantity 
we are interested in determining, and that we hope wiU 
yield the expected redshifted 21 cm fluctuations. 

3. MEASURING THE CONFUSION LEVEL FOREGROUND 

For all the cases discussed in this section, a 3'^'^-order 
polynomial is used to fit the full 32 MHz of observed 
spectrum along each pixel of the sky map according to 



die.v 



asiey + a2{ey + ai{0)iy + ao{9) (13) 



where d is the model fit introduced in Equation 11, and 
the coefficients a„ are solved on a per pixel basis to min- 
imize the least squared difference between the weighted 
dirty sky map, /[/, and the polynomial fit. The spectral 
fits are performed in linear coordinates. Fitting polyno- 
mials in logarithmic space has been considered by Wang 
et al. (2006) and others, but we find that, since the re- 
sponse of the interferometer acts to remove the mean 
value of each spectral channel and causes the value of 
some pixels to be less than zero, logarithmic fits pose 
extra complications. 

Before we consider the effects of the instrumental re- 
sponse, we first review the ability of the polynomial 
fit technique to remove the foreground contributions in 
the image domain representation of our foreground sky 
model alone (with no 21 cm signal). This corresponds 
to the principal scenarios addressed by previous efforts 
(Di Matteo et al. 2004; Zaldarriaga et al. 2004; Furlan- 
etto & Briggs 2004; Gnedin & Shaver 2004; Santos et al. 
2005; Wang et al. 2006; McQuinn et al. 2006; Gleser et al. 
2008). Figure 6 illustrates the results of performing the 
subtraction on the foreground-only sky model with no 
instrumental response. The top row of the Figure shows 
the input sky data cube. The left panel is a map from 
a single frequency channel, whereas the right panel is 
a I'-Oy plane that slices along the frequency axis. The 
second row of the Figure shows the residuals in the data 
cube following the polynomial subtraction (again the left 
panel is a map from a single frequency channel, whereas 
the right panel is a slice along the frequency axis). Fi- 
nally, the bottom row of the Figure shows the Fourier 
domain representation of the residuals in the data cube. 
The left panel in this case is the uw-plane for a single fre- 
quency channel and the right panel is a v-v plane slicing 
along the frequency direction. The residuals can be seen 
in the second row to be of order ~ 1 mK. This represents 
the best-case scenario that can be achieved by the 3^^^- 
order polynomial fit method to our foreground model. 
In general, the amplitude of the residuals could be made 
arbitrarily low by increasing the order of the polynomial 
used in the fits, but only at the expense of extracting 
additional power from the more rapidly varying spectral 
ffuctuations. Since, in actual observations, the polyno- 
mial fit will remove power from the redshifted 21 cm 
ffuctuations as well as the foreground contributions, it 



is desirable to use the lowest order polynomial feasible 
because this will limit the effects on the redshifted 21 cm 
power spectrum to only the longest length scales in the 
spectral domain. The bottom panel shows that the resid- 
ual angular power is peaked at large scales (low u). The 
dark vertical bands in the bottom-right panel are at the 
zero-crossings of the residual power, which tend to occur 
at roughly the same frequencies for all sight-lines. As was 
concluded by the previous efforts in the literature, it ap- 
pears that, in an ideal case at least, the polynomial sub- 
traction technique is sufficient to remove the foreground 
contribution in the sky to the level required to detect the 
expected 21 cm signal during reionization. 

Next, we include the instrumental response in the anal- 
ysis. Now there is an additional degree of freedom. 
The weighting function, U, that is applied to the sim- 
ulated gridded measurements before transforming from 
the Fourier to the image domain (see Equation 10) must 
be specified before the calculation can be performed. As 
discussed in the last section, we model uniform weight- 
ing, natural weighting, and a smoothed approximation to 
the natural weighting found by fitting a double exponen- 
tial profile to the distribution of baselines (see Figure 5). 
The results of the foreground subtraction technique per- 
formed on the simulated dirty sky maps for each of these 
weightings are presented in Figures 7 through 9, respec- 
tively. Figure 10 shows a direct comparison of the dirty 
map subtraction residuals for the same line of sight using 
the three different weighting schemes. 

3.1. Uniform Weighting 

The first case is that of uniform weighting and is shown 
in Figure 7. Here, the top row of the figure shows the 
dirty sky map, Id{9), produced by this weighting, the 
second row shows the results of the foreground subtrac- 
tion in the image domain, r{9), and the bottom row 
shows the results of the foreground subtraction after 
transforming back to the Fourier domain and remov- 
ing the weighting, r'{u). In all cases, the thermal noise 
contribution has been artificially removed following the 
analysis to more clearly illustrate the foreground resid- 
uals. As discussed in Section 2.3, this weighting gives 
the best effective angular resolution, but increases the 
uncertainty due to thermal noise in the dirty sky map. 

At first glance, the results of the foreground subtrac- 
tion in this case do not look especially promising since 
there are of order ~ 1 K fluctuations in the residual 
map, r{9), in the second row of Figure 7. However, af- 
ter the residual map is transformed back to the Fourier 
domain, it becomes evident that the polynomial fit has 
actually done an excellent job of subtracting the fore- 
ground contamination from baselines within a radius of 
u < 500A, and only a poor job for baselines beyond this 
radius. The transition from the region where the fore- 
ground subtraction was successful to the region where it 
failed is rapid and coincides with the radius where visi- 
bility measurements with the MWA become sufficiently 
sparse that there is no longer complete coverage. In this 
outer region, a given ut;-pixel is typically not sampled 
at all frequencies. This creates variations at the corre- 
sponding angular scales between image maps at different 
frequencies and results in additional spectral structure 
in the image domain data cube. Thus, on large angular 
scales, uniform weighting produces a dirty map with an 
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Fig. 4. — Realization of the MWA antenna layout (left) and corresponding distribution of baselines (middle). The 500 antennas are nearly 
coplanar and distributed over a 1500 m diameter region with a density that falls off as ~ r~^. The baselines in the uv-p\a.ne are formed 
by the pairwise combination of all the antennas for an instantaneous observation of a target at the zenith. The right panel illustrates the 
relative density of visibility measurements after earth-rotation synthesis has been performed by tracking a target field at R.A. = —60° and 
Dec. = —30° for six hours as it transits. 

carded or selectively cut before producing the dirty map 
in order to yield a frequency-independent ww-coverage. 
A variation of this approach was applied in Jelic et al. 
(2008) for the planned LOFAR configuration. 



3.2. Natural Weighting 

The foreground subtraction results for the natural 
weighting case are shown in Figure 8. It is clear even 
in the second row of Figure 8 that the residual fluctua- 
tions in the dirty map are on much larger angular and 
spectral scales and have larger amplitudes than for the 
uniform weighting case, with ~ 10 K fluctuations that are 
an order of magnitude greater than the expected 21 cm 
signal. And in this case, the residuals persist at all scales 
after returning the Fourier domain and plotting r'{u) in 
the third row of the figure. The best explanation for 
the poor subtraction obtained with the natural weight- 
ing can be found in the profiles of the weighting functions 
shown in Figure 5. The fine, hash-like fluctuations that 
are visible in the profile for the natural weighting due to 
the discrete antenna positions in the array are responsi- 
ble for introducing ripples and fluctuations into the dirty 
sky cube generated with this weighting. So, although 
natural weighting would result in the least thermal un- 
certainty in the dirty sky map, it does so at the expense 
of coupling the variations in the density of visibility mea- 
surements in the uu-plane to the dirty sky map. This is 
exactly what we would like to avoid in the sky map used 
for the foreground subtraction. 

We can estimate the importance of structure in the uv 
coverage at a given length scale. In general, the synthe- 
sized array beams at each frequency measured by an in- 
terferometer will be identical, but proportional in overall 
size to the inverse of the frequency, since the distribution 
of visibility measurements is identical, but scales with the 
wavelength in the uw-plane. The coherence length of fluc- 
tuations in visibility coverage in the ww-plane is mapped 
to the spectral domain by this frequency-dependent scal- 
ing. The relationship between the coherence length in the 
uu-plane, Z„, and the coherence length in the frequency 



Fig. 5. — Illustration of three different weighting functions used 
for forming the dirty sky maps in Figures 7 through 9. Each profile 
is for a north-south cross-section through the center of the baseline 
distribution after earth-rotation synthesis on the target field, and 
are for uniform weighting (dash line), natural weighting (gray line) 
and the smoothed double exponential approximation to natural 
weighting (black line). The profiles are normalized in this plot such 
that the maximum weight is one. The natural weighting profiles 
favor the short baselines, while the uniform weight accentuates the 
long baselines. 

effective PSF resembling a clean (5-function yielding no 
corresponding sidelobe confusion since there is complete 
uu-coverage over all frequencies, whereas for small angu- 
lar scales, the effective PSF is poorly behaved with large 
sidelobes. This comes at the expense of elevated thermal 
noise in the dirty map due to the sub-optimal weighting 
of the Mw-plane measurements when producing the map. 
The bottom row of Figure 7 illustrates that it may not 
be necessary to selectively tailor the ww-plane coverage 
to be frequency independent for the MWA. As long as 
the scales of interest are sampled completely for all fre- 
quencies, the sparsely sampled region does not interfere 
with the scales of interest following foreground subtrac- 
tion. Nevertheless, if the extra sidelobe noise in the dirty 
map due to the sparse coverage above u > 500A is not 
desired, then the long baselines could be completely dis- 
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Fig. 6. — Polynomial foreground subtraction technique applied to the input foreground sky model excluding the simulated instrumental 
response. Thermal noise and the 21 cm signal have been excluded from this simulation, as well, in order to provide a best-case reference for 
the foreground subtraction technique. The first row shows two cuts through the foreground sky model. The left panel is an image plane for 
a spectral channel at !^ = 157 MHz, thus the pixels in the plot are indexed by I'{6x,9y)- The right panel is a slice through frequency and 
one angular dimension, thus the pixels in the plot are indexed through I'(i', By). The smooth spectral properties of the foreground model 
are easily visible in the righthand plot. The middle row displays the residuals, r, in the image domain after the 3'^'^-order polynomial has 
been fit and subtracted from the spectra along each line of sight in the data cube. Again, the left panel is an image plane and the right 
panel is a slice through frequency and one angular dimension. The bottom row displays the residuals in the Fourier domain. Although 
it appears in the middle row that the residuals after foreground subtraction will be comparable to the ~ 25 niK redshifted 21 cm signal, 
after transforming the residuals back to the Fourier domain (bottom row), it is evident that the polynomial fit and subtraction removed 
the foreground contribution well for all but the largest angular scales. 
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Fig. 7. — Polynomial foreground subtraction technique applied to the simulated dirty sky map generated with a uniform weighting 
of the visibility map. The sky model used for the subtraction included only the foreground and thermal noise terms and excluded the 
21 cm signal. The thermal noise contribution, however, has been artificially removed from these plots in order to better illustrate the 
foreground subtraction residuals. In this case, the bottom row displays the unweighted residuals, r', in the Fourier domain. Similar to the 
reference case in Figure 6, it appears in the middle row that the residuals after foreground subtraction will be much larger this time than 
the ~ 25 mK redshifted 21 cm signal. However, after transforming the residuals back to the Fourier domain and unweighting, we find that 
the polynomial fit and subtraction removed the foreground contribution well again for most baselines. Aside from the short baselines that 
were poorly cleaned in even the instrument-free case, only baselines longer than ~ 500A are significantly contaminated when performing 
the subtraction in the dirty map generated with uniform weighting. 
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Fig. 8. — Same as Figure 7, but for the polynomial subtraction technique applied to the dirty sky map generated with a natural weighting 
of the visibility map. No 21 cm signal was included in the analysis and the thermal noise has been artificially removed from the plots. As 
with the uniform weighting, the residuals, r, in the image domain appear much larger than for the best-case fits to the sky model alone. 
In this case, however, even after transforming back to the Fourier domain and unweighting, the residuals remain much larger than for the 
other weighting methods. 

domain, l^, is given by 



L 
vo 



(14) 



where |m| is the radius of interest in the uu-plane at fre- 
quency vq. From this expression, it is easy to see that 
modest features in the coverage of the visibihty measure- 
ments near the origin of the uv-plane do not introduce 
significant fluctuations into the spectral domain because 
Zi, ^ cxD as \u\ — > 0, whereas variations in the cover- 
age or weighting at large radii in the Mw-plane will have 
much shorter coherence lengths in the spectral domain 
because Z^ — > as |m| ^ oo. In order to produce fluctu- 
ations in the spectral domain that have sufficiently long 
periods that they can be flt by a low-order polynomial 
across the observed bandwidth requires l^ > B, where 
B is the bandwidth. For the MWA, with vq w 150 MHz 



and S = 8 to 32 MHz, this condition requires that 

7 >l 1^ ^ l"l 



(15) 



in order to prevent fluctuations in the spatial coverage 
from impacting the polynomial fits used in the fore- 
ground subtraction. Referring to Figure 5, this condition 
is clearly not met for the natural weighting function with 
its fine-scale features. 

3.3. Smoothed Natural Weighting 

We have seen so far that the uniform weighting al- 
lows foreground subtraction to succeed well below the 
required level, whereas the natural weighting does not. 
However, the uniform weighting succeeds at the expense 
of increasing the thermal noise in the derived dirty map 
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Fig. 9. — Same as Figure 7, but for the polynomial subtraction technique applied to the dirty sky map generated with the smoothed 
approximation of the natural weighting of the visibility map. No 21 cm signal was included in the analysis and the thermal noise has been 
artificially removed from the plots. For this case, the residuals, r, in the image domain are the lowest for the three dirty sky maps tested, 
but are still two orders or magnitude greater than the best-case subtraction for the sky model with no instrumental response. Transforming 
back to the Fourier domain and unweighting yields residuals that are nearly as low as the uniform weighting case. 



by nearly an order of magnitude compared to the nat- 
ural weighting. Since the natural weighting fails due to 
the fine-scale structure in the baseline distribution of the 
array, we explored smoothing the natural weighting func- 
tion by approximating the distribution of baselines with 
a double exponential profile given by (see Figure 5) 

iV(w)-e-l"l/'i+e-l"l/'^ (16) 

where /i = 90A and h = 190A, and the amplitude is set 
to match the true baseline distribution. The function 
is cored for |u| < 6A, approaching linearly, in order 
to prevent over-weighting these scales compared to the 
true natural weighting. This weighting function retains 
the emphasis on the short baselines inherent in the natu- 
ral weighting case, but without introducing the fine-scale 
variations in the density of the visibility measurements 
into the weighting function. In our simulation, it resulted 



in only a 7% increase in thermal noise in the dirty map 
compared to the true natural weighting. The results of 
the foreground subtraction for this case are shown in Fig- 
ure 9. This weighting function produces the best results 
in r{6) and nearly matches the uniform weighting case in 
r'{u). 

3.4. Review 

We pause for a moment to review the results of the 
analysis so far. At this stage, we can conclude that sub- 
traction of confusion-level foreground contamination in 
dirty sky maps produced by the MWA is possible using 
the simple "per pixel" polynomial subtraction technique. 
Uniform weighting of the griddcd visibility measurements 
during the dirty map generation process produces the 
best results of the three cases we tested. If maximizing 
the S/N in the derived dirty map while preserving the pu- 
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Fig. 10. — Polynomial subtraction residuals along the same line 
of sight in the dirty sky data cubes for the three weighting schemes 
as well as for the pure foreground sky model with no instrumental 
response simulated. No 21 cm signal was included and thermal 
noise has been artificially removed. The weighting schemes arc 
uniform (thick solid line), natural (dash-dot line), exponential (thin 
solid), and the foreground model alone (dotted line). 

rity of the spectral domain from mode-mixing contam- 
ination is desired, then we have shown that a smooth 
approximation to the natural weighting function yields 
similar results to the uniform weighting, but without sig- 
nificantly increasing the thermal noise in the dirty map 
above the best-case natural weighting level. With addi- 
tional studies, it seems likely that the smoothed variation 
of the natural weighting could be made to match the per- 
formance of the uniform weighting. 

It may seem counter-intuitive that the success of the 
polynomial subtraction algorithm should be largely in- 
dependent of the level of thermal noise in the input dirty 
map, so we note that this appearance is due, in part, to 
the fact that we have artificially removed thermal noise 
from the residuals plotted in Figures 7 through 9 in or- 
der to facilitate an unhindered comparison of the differ- 
ent weighting schemes. The derived foreground models 
do include the contributions of thermal noise, as do the 
resulting residuals. However, our interest is not in the 
uncertainty of the foreground model, d{9), but rather the 
uncertainty in the recovered 21 cm power spectrum, esti- 
mated from r' (u) . We have thus far postponed our book- 
keeping of the thermal uncertainty so as not to confuse 
to the two. In the remainder of this paper, we address 
this issue of thermal noise in the measurements and the 
ability of the MWA to detect the redshifted 21 cm signal 
after applying the foreground subtraction steps discussed 
above. Since we have seen that the uniform weighting 
case is sufficient for foreground subtraction, we will use 
only uniform weighting in the analysis below for clarity 
of presentation. 

4. RECOVERING THE EOR SIGNAL 

After foreground subtraction, the ~ 10* individual 
measurements comprising an MWA data cube contain 
largely thermal noise power and only a very small contri- 
bution from the redshifted 21 cm signal (and foreground 
residuals). To detect the EoR signal we need to average 
over all of these measurements to obtain 10-20 estima- 
tors with sufficient S/N to detect and characterize the 
reionization signal. The foundation for a statistical EoR 
power spectrum measurement of the brightness tempera- 
ture fluctuations in low-frequency, wide-field radio obser- 
vations has been developed in the literature by Morales 
& Hewitt (2004) and Zaldarriaga et al. (2004). These 



efforts are built on the similar approach employed for in- 
terferometric measurements of CMB anisotropics (White 
et al. 1999; Hobson & Maisinger 2002; Myers et al. 2003). 
The primary approach is to convert the three dimensional 
measurement cube to a one dimensional power spectrum. 

Following foreground subtraction, the full 32 MHz data 
cube will be divided into 8 MHz sub-bands, each of which 
will be reduced into a power spectrum independently. 
This approach is planned for the MWA analysis pipeline 
because the redshifted 21 cm signal is expected to evolve 
considerably over the Az « 2 redshift range spanned by 
the full 32 MHz observation. By treating four sub-bands 
independently, any evolution of the signal can be largely 
neglected in the interpretation of the results since each 
sub-band will span only Az « 0.5. The effect that this 
simplification has on the power spectra measurements is 
to eliminate measurements of the largest spatial scales, 
corresponding to the lowest fc|| modes, along the line-of- 
sight direction. Since these modes contain most of the 
foreground power, the are not expected to have yielded 
usable information about the 21 cm signal anyway. 

Each residual data sub-cube, r'({u, j/}), is trans- 
formed into three-dimensional wavenumber by applying 
a Fourier transform along the frequency-axis, v ^ rj, 
and then applying a coordinate transformation into cos- 
mological coordinates k: 

r'(k) = J(k, {u, r7})F({u, 77}, {u, 4)r'({u, ^}) (17) 

where J(k, {u, 77}) is given by the Jacobian of the coordi- 
nate transformation from u, 77 (in units of A and Hz^^) to 
k (in units of cMpc^\ see Morales & Hewitt (2004) for 
details). The residual r'(k) can be thought of as the 
three-dimensional Fourier transform of the foreground 
subtracted image cube as expressed in comoving Mpc. 

Due to the isotropy of space, the power spectrum is ap- 
proximately spherically symmetric in k coordinates. We 
can thus square r'(k) and average the result in spherical 
annuli 



P(fc) = (|r'(k)| 



(18) 



to form the one dimensional power spectrum (Morales & 
Hewitt 2004), or the more common dimensionless power 
spectrum given by A^ = fc^P(fc)/(27r^). However, it is 
useful to break the averaging from the three dimensional 
k-space to the one dimensional /c-space into two steps 
since both the foregrounds and a full treatment of the 
predicted redshifted 21 cm signal have aspherical struc- 
ture in the Fourier domain. Following the approach of 
McQuinn et al. (2006), we first average over the angu- 
lar sky direction to obtain P{k±, fc||). This is concep- 
tually similar to averaging over the m values and keep- 
ing the £ values in a CMB analysis, except we still have 
the line-of-sight dimension. In practice, the 2D and ID 
power spectra are determined following the maximum 
likelihood formalism that reduces to weighting the indi- 
vidual measurements by the inverse of the per-cell noise 
variance in the limit of no covariance terms between the 
cells. For the 2D power spectrum this yields: 



Eik,|.t,c/„(k) 
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and for the one dimensional power spectrum: 



P{k) 



E|kMU„(k,k)|r'(k)r 



(20) 



where Un is related to the natural weighting function 
used to generate dirty maps from the gridded visibil- 
ity measurements, but with spectral structure converted 
into spectral covariance terms by the Fourier transform 
along the frequency axis. Neglecting the effects of RFI 
and RRL excision, the fc|| covariance is highly peaked for 
the MWA due to the dense uv-coverage and we make 
the simplifying approximation in Equations 19 and 20 to 
treat it as a delta function in this analysis so that the 
covariance matrix for the k-space Fourier data cube is 
taken to remain diagonal and Un) can be treated as a 
multiplicate operator independent of fcii . 

Theoretical models indicate that the true redshifted 
21 cm signal will be composed of contributions (due to 
velocity field and other astrophysical effects) from modes 

that are modulated according to powers of /x = k ■ fi, 
where /i is the cosine of the angle between the line of 
sight and the wavevector (Kaiser 1987; Barkana & Locb 
2005; McQuinn et al. 2006). Thus, 

P21 (fc) = P^o (fc) + ^2p^, (fc) + ^4p^^ (fc) + . . . , (21) 

and can also be fully described by P2i(fcj_, fcy)- The sim- 
ple model of the 21 cm signal we arc using in this paper, 
however, contains only a single P^a contribution and the 
MWA will have insufficient sensitivity initially to detect 
the higher-order aspherical effects independently. 

Even in reduced power spectra, the 21 cm signal is 
weaker than the total thermal noise power and additional 
steps must be taken to isolate it. There are two pos- 
sible approaches for extracting the 21 cm contribution 
from the noise-dominated measurement: 1) calculate the 
thermal noise contribution to the auto-correlation gen- 
erated power spectrum and subtract it from the final 
result, or 2) generate the power spectrum by dividing 
the observation into two epochs of equal duration and 
then cross-correlating the data cubes from the two epochs 
(M. Tegmark, private communication). This second ap- 
proach preserves the persistent 21 cm signal and elimi- 
nates the thermal noise power (which will be independent 
between the two observing epochs and, therefore, aver- 
age to zero during the cross-correlation), leaving only 
the thermal uncertainty. It has the advantage that the 
system temperature does not need to be known indepen- 
dently to high precision in order to recover the 21 cm 
signal (as long as the thermal noise properties of the in- 
strument do not vary between epochs). We have chosen 
to follow this second approach and calculate the ID and 
2D power spectra as cross-power spectra by replacing the 
|r'(k)|2 terms in Eqns 18-20 with (r'{r'^ + ^2*^)72 such 
that Eqn 18 becomes: 



P{k) = 



(22) 



where r'l and r'2 are the residuals for the two epochs of 
the simulated observation. 

The panels in Figure 11 illustrate the post-subtraction 
residuals in the 2D power spectrum of the full simu- 
lated instrumental response. The left column of pan- 
els show the post-subtraction power spectra of the in- 
dividual and combined sky model components with: a) 



the foreground-only contribution to the sky model, b) 
the redshifted 21 cm-only contribution, and c) the full 
combined sky model. Thermal noise has been artificially 
removed from these three panels to highlight the struc- 
ture of the astrophysical sky. It is clear that the residual 
power in the post-subtraction, foreground-only contribu- 
tion in panel (a) is well below the 21 cm-only power in 
panel (b). Panel (c) confirms that the foreground sub- 
traction recovers the 21 cm signal since the subtraction 
results on the full sky model are nearly identical to the 
21 cm-only results. The residual foreground contamina- 
tion in panel (a) does not have the same symmetry as 
the redshifted 21 cm signal, and tends to have functional 
forms given by P(k) = P{ki_)P{k\\), as opposed to the 
spherical and /i-modulated 21 cm symmetries. Morales 
et al. (2006) have discussed using these differences in 
shape to further separate the 21 cm power spectrum from 
the foregrounds, although we do not pursue this step in 
the present analysis. 

The right column of panels in Figure 11 illustrates the 
thermal noise properties and the removal of the mean 
thermal noise power from the data cube by the cross- 
correlation. Panel (d) shows the total thermal noise in 
the simulated power spectrum after 360 hours of obser- 
vation with the MWA. Despite the long integration, the 
total thermal noise still dominates the desired 21 cm sig- 
nal by over an order of magnitude. After dividing the 
simulated observation into two epochs and calculating 
the cross-correlation, however, the mean thermal noise 
power is removed from the final power spectrum, leav- 
ing only the residual thermal uncertainty power shown 
isolated in panel (e) and with the full simulated measure- 
ment including the sky signal in panel (f ) . The MWA will 
not have enough sensitivity within its first year of ob- 
servations to produce a high S/N 2D power spectrum so 
the ID power spectrum will be the primary output prod- 
uct. The white curves in panel (f) indicate the spherical 
shells of constant-fc that are used to calculate the ID 
power spectrum in Figure 12. In addition to the recov- 
ered 21 cm power spectrum. Figure 12 also illustrates the 
total sky power and total thermal noise power in the ID 
power spectrum before subtraction and cross-correlation. 

We note that, during a year-long season of observing 
on the primary target window, the MWA should produce 
four power spectrum measurements, each comparable to 
the simulated results in Figures llf and 12, spanning the 
full rcdshift range 7.1 < z < 9 covered by the 32 MHz 
instrumental bandwidth in blocks of Az « 0.5, corre- 
sponding to the individual 8 MHz sub-bands of the di- 
vided data cube. In addition, a secondary target field 
should yield an independent set of observations produc- 
ing four additional similar power spectra, for a total of 
eight unique power spectra documenting the 21 cm emis- 
sion during the reionization epoch. 

4.1. Characterization of Subtraction Effects 

The foreground subtraction process introduces errors 
into the uncorrected derived estimate of the redshifted 
21 cm power spectrum. The two principal sources of 
possible error are: 1) the unavoidable removal of some of 
the large-scale power from the redshifted 21 cm signal by 
the polynomial fitting and subtraction algorithm, and 2) 
the contamination of the derived power spectrum by any 
residual foreground power resulting from an imperfect 
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Fig. 11. — 2D power spectra of post-subtraction residuals from dirty maps generated with uniform weighting. For the left column of 
panels, the thermal noise has been artificially removed. It is evident that the foreground-only residuals following subtraction are much 
lower than the 21 cm signal in the range k± ^ 3 X 10~^, beyond which corresponds to the outer anulus of poor foreground subtraction 
in the bottom panel of Figure 7. Panel c confirms that the 21 cm signal dominates the recovered power from the full sky model since it 
appears nearly identical to panel b. The white arcs in panel / illustrate the spherical shells of constant-fe that are used for the ID power 
spectrum in Figure 12. 

subtraction. These effects must be quantified tlirough 
modeling to accurately interpret real measurements. As 
an initial step, we introduce a subtraction characteriza- 
tion factor, /s, that captures the ratio of the recovered 
power spectrum to the true power spectrum such that 



/.(k)=P2i(k)/P^i(k), (23) 

where Pji is the true input 21 cm power spectrum, and 
P21 is the recovered output power spectrum following 
the simulated measurement and foreground subtraction 
process. 

Figure 13 plots the net characterization factor calcu- 
lated for our fiducial model. The bias due to perform- 
ing the polynomial fit and subtraction in the presence of 



foregrounds reduces the overall power in the recovered 
power spectrum at all scales (such that fs{k) < 1), al- 
though this effect is minor — less than 10% at any scale, 
and of order only ^ 1 % for fc > 10"^ Mpc"'^. This sug- 
gests that the polynomial subtraction process is remov- 
ing more power from the 21 cm signal than any residual 
foreground power following subtraction is adding to the 
recovered estimate. This conclusion is confirmed when 
we calculate the characterization factor for the recovered 
signal without any foregrounds in the sky model and find 
less than a 0.1% difference between this 21 cm-only case 
and the result for the full sky model shown in the Figure. 
As a counter point, we include in Figure 13 a separate 
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Fig. 12. — Simulated measurement of the ID spherically binned 
redshifted 21 cm power spectrum by the MWA in a single 8 MHz 
sub-band with 360 hours of integration. The thin solid line is the 
input 21 cm signal. The error bars give the thermal uncertainty 
on the recovered measurement and are Icr. Measurements below 
k < 10~^ Mpc~^ are highly coupled due to the limited depth of 
the observed comoving volume, as is evident from the constant-fe 
contours in the bottom-right panel of Figure 11. The thick solid 
line shows the total power (before subtraction) in the simulated 
sky data cube including thermal noise and foregrounds, and the 
thick dash line shows the total power of the thermal noise alone. 

curve (thin solid line) for the characterization factor cal- 
culated for the subtraction process on the full sky model 
using only a 2"'^-order polynomial. In this case, residual 
foreground power resulting from the poorer fit by the 
2"'^-order polynomial contributes more to the recovered 
21 cm power spectrum than the polynomial fit itself re- 
moves and yields /« > 1 over the entire spectrum. This 
is particularly evident for fc < 10^^ Mpc^'^, where it is 
clear that the lower-order polynomial failed to remove 
much of the foreground power. Finally, we note that in- 
creasing the polynomial to 4*'^-order has a minor effect 
on the recovered 21 cm signal (shown as the dashed line 
in Figure 13) that is evident only at large scales. This 
suggests that there is additional margin in the polyno- 
mial subtraction approach to handle more complicated 
foregrounds than currently anticipated without substan- 
tially degrading the recovered 21 cm signal. 

In principle, the characterization factor is dependent 
not only on the foreground subtraction algorithm, but 
also on the specific shapes of the redshifted 21 cm signal 
and the foregrounds. Since we have assumed a simple 
21 cm model with structure based only on a Gaussian 
random field of matter density perturbations as well as a 
fairly general model of the foreground contribution, the 
/s derived from this analysis should be taken only as a 
guide. 

5. DISCUSSION 

In the preceding sections, we have shown that the ef- 
fects of the frequency-dependent instrumental response 
of the MWA do not hinder the simple polynomial-fit dif- 
fuse foreground subtraction scheme to recover the red- 
shifted 21 cm signal. This is particularly encouraging 
because the fit has been performed in dirty sky maps, 
which are very sensitive to instrumental effects. Thus, 
we have established, in some sense, a worst-case scenario 
that appears manageable. 

Dense Mw-coverage is a new advancement for radio ar- 
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Fig. 13. — Subtraction characterization factor, /s, for the ID 
binned power spectrum. The characterization factor is shown for 
three different levels of polynomial subtraction. As expected, the 
largest error is for large spatial modes (low fc). Over most scales, 
however, the correction for our fiducial 3'''^-order polynomial sub- 
traction is only --^ 1%. 

rays. In the traditional paradigm, the uu-plane is almost 
always sparsely sampled (as with the VLA and other ex- 
isting arrays), and there are essentially no regions where 
the coverage can be said to be "complete" , as is the case 
for the MWA within u < 500A. The reason that this is 
generally not a problem for telescopes like the VLA is 
that, in most cases, these instruments are not attempt- 
ing to make observations at or below the confusion limit 
imposed by faint extragalactic continuum sources and 
diffuse Galactic structure. So, although the VLA has 
sparse coverage in the uw-plane, the sky that it is ob- 
serving is also essentially sparse and deconvolution algo- 
rithms are able to reconstruct a reasonable model of the 
point sources in the field of view. 

It is useful to construct an analytic approximation for 
the simulations discussed in the previous sections in order 
to foster a more intuitive understanding of the interac- 
tion between the array properties and the success of the 
foreground subtraction. Condon (1974), Perley & Erick- 
son (1984), and Artyukh (2003) provide a foundation for 
such a model, and Sault and Wayth [2006, 2007, MWA 
project documents] have adapted the results of these ef- 
forts for analyzing confusion noise in measurements of 
the MWA. 

As we outlined in Section 2.3, a dirty sky map produced 
by an interferometer is, neglecting calibration errors, the 
result of multiplying the true sky by the primary antenna 
tile beam and then convolving by the synthesized array 
beam. If the sky consists only of the faint extragalactic 
continuum sources with Poisson statistics, then the vari- 
ance in the intensity of the dirty sky map, ct|j, will be 
related to the variance in the true sky by 



'DK'^XT'Jy) 



2 



B {6x—0^,6y—9y)P {dx,Oy) dOxdOy 

(24) 

where ct^ is the variance in the intensity of the true sky, B 
describes the response of the synthesized array beam as 
a function of angle, and P describes the primary antenna 
tile beam. This equation can be simplified by assigning 
simple models for the response profiles of the synthesized 
array beam and the primary antenna tile beam. We take 
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the beams to be described by tophat functions, such that 
the response is defined to be one within a region of di- 
ameter Qb (for the synthesized array beam; Op, for the 
primary antenna tile beam). Outside this region, the re- 
sponse is taken to be Brms ^ 1 for the synthesized beam, 
and zero for the primary antenna tile beam. Thus, we 
include an allowance for the side lobes of the synthesized 
beam, but not for the primary beam. With these simpli- 
fications, the integrals in Equation 24 can be performed 
and the expression reduces to 



'D 



4(1 






y^i 



(25) 



0p is the solid 



where Cg is the variance due to the faint sources in our 
model and accounts for the extra variance due to the 
weak angular clustering, VIb ~ &% i^ ^^^ solid angle of 
the synthesized array beam, and Qp r 
angle of the primary antenna tile beam 

Inspecting this simplified form of Equation 24, we see 
that when the synthesized array beam has no side lobes, 
the variance in the dirty sky map is equal to that of 
the true sky map (after gridding). This is equivalent 
to complete coverage of the ww-plane and corresponds 
generally to the cases considered by Di Matteo et al. 
(2004); Zaldarriaga et al. (2004); Furlanetto & Briggs 
(2004); Gnedin & Shaver (2004); Santos et al. (2005); 
Wang et al. (2006); McQuinn et al. (2006); Gleser et al. 
(2008). But when side lobes are included in the model 
of the synthesized array beam, the variance in the dirty 
map is increased above that of the true sky map. For 
sparse coverage in the Mz;-plane, B^^^ will be given ap- 
proximately by the inverse of the number of independent 
measurements in the uv-p\a,ne. In the limiting case of a 
single, very short integration, (such as a "snapshot" ob- 
servation), the number of independent measurements is 
equal to the number of baselines. For the MWA, with its 
~125,000 instantaneous basehnes, S^ms ~ 10~^ in the 
worst-case, and is even better (lower) for long integra- 
tions when earth-rotation synthesis increases the num- 
ber of independent measurements. Using Gb ~ 5' and 
Qp w 30°, we can estimate for the MWA that 



Brms^p/^l 



1. (26) 

Thus, the variance due to the side lobes in the synthe- 
sized array beam is comparable in magnitude to the in- 
herent variance in the intensity of the true sky. A large 
field of view acts to oppose the advantages of dense cover- 
age of the uw-plane in this instance by introducing more 
variance into the dirty sky map. In this regime, in order 
to produce an estimate of the sky without the effects of 



the side lobes of the synthesized array beam, the full in- 
version of A is required since it cannot be approximated 
as a sparse matrix. 

Nevertheless, side lobe-induced noise in the dirty sky 
maps does not necessarily mean that the foreground sub- 
traction will be adversely affected. If the side lobe pat- 
terns were the same for each image plane in the data 
cube, then the spectrum along each pixel would simply 
receive a faint mirror contribution from the smooth spec- 
tra along all the other pixels. It is the variation with 
frequency of the side lobe-induced noise that can cause 
trouble in the planned foreground subtraction technique. 
As we found in Section 3, however, the density of the 
MWA uw-coverage effectively eliminates the frequency- 
dependence of the sidelobe structure of the array beam 
over the large range of scales where the uw-plane is com- 
pletely sampled at all frequencies. Although it does not 
appear to be necessary for the MWA, less densely sam- 
pled arrays could control the spectral dependence of their 
array beam sidelobes by masking out regions of the uv- 
plane that are not sampled at all frequencies and only 
processing regions that do have coverage in each spectral 
channel (as considered in Jelic et al. (2008) for the more 
sparsely sample LOFAR). 

Despite the increased susceptibility of long baselines 
to introducing undesirable fluctuations into the spectral 
domain, it is important to point out that they serve an 
important purpose. Long baselines are still expected to 
be necessary in order to peel away the bright sources 
in the field of view since the peeling process requires a 
precise knowledge of the position of each bright source, 
which is greatly improved with long baselines. 

It has been shown previously that the ideal baseline 
distribution for minimizing thermal uncertainty in mea- 
surements of the redshifted 21 cm ID power spectrum 
is extremely condensed (Bowman et al. 2006; Lidz et al. 
2007). In this work, we have further demonstrated that a 
compact array like the MWA also produces the dense uv- 
coverage necessary to efficiently subtract faint and diffuse 
foregrounds from dirty maps. 
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